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Summary. This article considers tlie application of particle filtering to continuous- 
discrete optimal filtering problems, where the system model is a stochastic differ- 
ential equation, and noisy measurements of the system are obtained at discrete 
instances of time. It is shown how the Girsanov theorem can be used for evaluat- 
ing the likelihood ratios needed in importance sampling. It is also shown how the 
methodology can be applied to a class of models, where the driving noise process 
is lower in the dimensionality than the state and thus the laws of state and noise are 
not absolutely continuous. Rao-Blackwellization of conditionally Gaussian models 
and unknown static parameter models is also considered. 
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1 . Introduction 

This article considers the apphcation of sequential importance sampling based 



ims article consiaers tne application oi sequential importance sampling oasea 
particle filtering (see, e.g. Kitagawa . 1 19961: IDoucet et al.l . l200lh to continuous- 



discrete filtering problems (jJazwinsk i. 1970i), where the dynamic model is a 



stochastic differential equation of the form 

dx(t) =f(x(i),t)dt + L(i)d/3(t), (1) 
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where x(i) G R" is the state, f : R" x R+ R" is the drift term, L{t) : 
R" X R+ 1-^ R"^* is the dispersion matrix, and f3{t) G W is an s-dimensional 
Brownian motion wi th diffusion matrix QJt). It is^assumed that the required 
conditions ( Karatzas and Shrevel . Il99l1 : ["0ksendai feoosh for existence of a strong 
solution to the equation are satisfied. In this article, we first consider the case 
where the dimensionality of the state is the same as the dimensionality of the 
Brownian motion, that is, where s — n. We also extend the results to the 
singular case where the dimensionality of the Brownian motion is less than the 
dimensionality of the state, that is, where s < n. 

The likelihood of a measurement is modeled by a probability density, 
which is a function of the state at time i^: 



yfe '-p(yfe |x(tfc)). 



(2) 



The purpose of the Bayesian optimal continuous-discrete filter is to compute the 
posterior distribution (or at least the posterior mean) of the cu rrent stat e x(tfc ) 
given the measurements up to the current time, that is ( JazwinskL ,1966 , 1970fl 



p(x(tA;)|yi,---,yfc)- 



(3) 



This kind of continuous-discrete filtering models are common in engineering 
applications, espec i ally in the fields of navigation, communication and control 
(iBar-Shalom et all I2OOII : iGrewal et all I2OOII : IStengell . Il994l: IVan Treesl Il968 . 



19711 ). In these applications, the dynamic system or a physical phenomenon can 



be modeled as a stochastic differential equation, which is observed at discrete 
instances of time with certain physical sensors. The purpose of the filtering or 
recursive estimation is to infer the state of the system from the observed noisy 
measurements. 

In this article, novel measure transformation based methods for continu- 
ous-discrcte seque ntial imp ortance resampli ng (see, e.g. 'Gordon et al.', 'l993'; 
[Kitagawa, 1996; Pi tt and Sh ephard. 1999; Do ucet et al .. 2001; Ristic ct al„ 2004 ) 
are pr esented. Some of the methods have already been presented in dSarkkal 

l2006bl [al). but here the methods are significantly extended. The metho ds are 

based on transformations of probability measures by the Girsanov theorem dKalli anpui 
19801: [ Karatzas and Shrevel Il99ll ; l0ksendai l2003h . which is a theorem from 



mathematical probability theory. The theorem can be used for computing like- 
lihood ratios of stochastic processes. It states that the likelihood ratio of a 
stochastic process and Brownian motion, that is, the Radon-Nikodym derivative 
of the measure of the stochastic process with respect to the measure of Brownian 
motion, can be represented as an exponential martingale which is the solution 
to a certain stochastic differential equation. 

Measure transformation based approaches are particularly successful in con- 
tinuous time filtering (Kallianpur, 1980), but are less common in continuous- 
discrete filtering. The general idea of using the Girsano v theorem in impor- 
tance sampling of SDEs has been presented, for example, in lKloeden and Platen 
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( 1999). Similar ideas have also been i 


jresented by several authors ( lonided. 2004 ; 


Crisan and Lyons 


. 19991 Crisan et al. 


. 19981 Crisan. 2003; Moral and Miclol. 2000l) 


Beskos et alj ( 


20061) considers exact Monte Carlo simulation of a restricted 



class of diffusion models, which are observed at discrete instances of time without 
any observation error. As shown in the discussi on of the ar t icle, t he observa- 



tion errors can be included in the model. Fea rnhead et al.l (2007) introduces 



particles filters for a class of multidimensional diffusion processes, and the used 
M onte Carlo s ampling methodology is based on the exact simulation framework 
of iBeskos et al. (2006). The difference to the present methodology is that the 
me thods of Fearnhead et al.l (l2007t) are not based on time-discretization. 

Durham and GallantI (|2002l )" considers simulated maximum likelihood estima- 



tion of parameters of discretely observed stochastic stochastic differential equa- 
tions, where all or some of the components are perfectly observed. The methods 
are based on approximating the transition dens ities of the processes and model- 
ing the unobserved sample paths as latent data. IColightlv and Wilkinson ( 20061 ) 
applies similar methodology to sequential estimation of sta te and parameters 
of stochastic differential equation models. IChib et al.l (j2004r ) considers MCMC 
based simulation of diffusion driven state space models. In the article, it is also 
shown how the methodology can be applied to particle filtering of such models. 

The advantages of the method proposed here over the previously proposed 
methods are: 

• Unlike many measure transformation based approaches the methodology 
presented here is not restricted to one-dimensional or to SDE models with 
non-singular dispersion or diffusion matrices. The state dimensionality can 
be higher than the dimensionality of the driving Brownian motion, which 
is equivalent to the case that the dispersion/diffusion matrix is singular. 

• The SDE formulation of the likelihood ratio computation allows efficient 
numerical solving of the problem. In particular, simulation based ap- 
proaches (iKloed en and Pla ten. 1999) can be applied. Of course, any other 
numerical methods for SDEs could be applied as well. 

• Dispersion (and diffusion) matrices may depend on time, that is, the driv- 
ing process can be time inhomogeneous. 

• The observation errors can be easily modeled and the model flexibility is 
the same as with discrete-time particle filtering. 

• Efficient importance distributions and Rao-Blackwellization can be easily 
used for improving the efficiency of the sampling. 



2. Continuous-Discrete Sequential Importance Resampling 

2. 1. Filtering Models 

We shall concentrate into the following four forms of dynamic models: 
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(a) Non-singular models, where the dispersion matrices are invertible and thus 
the dimensionahty of the process is the same as of the driving Brownian 
motion. The advantage of this kind of processes is that their hkelihood 
ratios can be easily evaluated using the Girsanov theorem, but the problem 
is that they are too restricted models for many applications. 

(b) Singular models, where there is non-singular type of model, which is em- 
bedded inside a deterministic differential equation model and thus the joint 
model is singular because the dimensionality of the process is higher than 
of the driving Brownian motion. This kind of models are typical in nav- 
igation and stochastic control applications, where the deterministic part 
is typically plain integral operator. Because the outer operator is deter- 
ministic, the likelihood ratios of processes are determined by the inner 
stochastic processes alone and thus importance sampling of this kind of 
process is very similar to the processes of non-singular type above. 

(c) Conditionally Gaussian models, where a linear stochastic differential equa- 
tion is driven by a model of the non-singular or singular type above. This 
kind of models can be handled such that we only sample the inner pro- 
cess and integrate the linear part using the Kalman filter. This way we 
can form a Rao-Blackwellized estimate, where the probability density is 
approximated by a mixture of Gaussian distributions. 

(d) Conjugate static parameter models, where the model contains a static pa- 
rameter in such conjugate form that certain marginalizations can be ana- 
lytically evaluated. This result in particle filter, where only the dynamic 
state is sampled and the sufficient statistics of the static parameter are 
evaluated at each update stage. 



2.2. Non-Singular and Singular Models 

Assume that the filtering model is of the form 

dx = f (x, t) dt + L{t) d/3 
Yfe ~p(yfc |x(tfc)), 

where f3{t) is a Brownian motion with positive definite diffusion matrix Q{t), L(t) 
is an invertible matrix for alH > and the initial conditions are x(0) p(x(0)). 
Further assume that we have constructed an importance process s{t), which is 
defined by the SDE 

ds ^ sis, t) dt + B{t)df3, (5) 

and which has a probability law that is a rough approximation to the filtering 
(or smoothing) distribution of the model (|4]), at least at the measurement times. 
The matrix B{t) is also assumed to be invertible for all t >0. Note that at this 
point we do not want to restrict the matrix B(t) to be the same as L{t), because 
this allows usage of greater class of importance processes as shall be seen later 
in this article. 
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Now it is possible to generate a set of importance samples from the condi- 
tioned (i.e., filtered) process x(i), which is conditional to the measurements yi:k 
using s(t) as the importance process. The motivation of this is that because the 
process s{t) already is an approximation to the optimal result, using it as the 
importance process is likely to produce a less degenerate particle set and thus 
more accurate presentation of the filtering distribution. 

Because the matrices L{t) and B{t) are invertible, the probability measures 
of X and s are absolutely continuous with respect to the probability measure of 
the driving Brownian motion f3{t) and it is possible to compute likelihood ratio 
between the target and importance processes by applying the Girsanov theorem. 
The explicit expression and derivation of this likelihood ratio is given in Theorem 
IA.2I of Appendix El 

The SIR algorithm recursion starts by drawing samples {xq } from the initial 
distribution and setting w^'' — l/N, where N is the number of Monte Carlo 
samples. The continuous-discrete SIR filter algorithm then proceeds as follows: 

Algorithm 2.1 (CD-SIR I). Given the importance process s{t) , a weighted 
set of samples {x[.'lj^, w^'^il o-nd the new measurement y^, a single step o/ con- 
tinuous-discrete sequential importance resampling can be performed as follows: 

(a) Simulate N realizations of the importance processes 



from t = tk-i to t = tk, where f3^^\t) are independent Brownian motions, 
and i = 1, . . . , N . 
(b) At the same time, simulate the log-likelihood ratios 



dA« = {f(s*«(t),t)-L(t)S-i(t)g(s«(t),t)}^ 
X L-^{t) Q-\t) d/3W(t) 

-i{f(s*«(i),t)-L(t)S-i(Og(s«(0,t)}^ 
X {L{t)Q{t) L'^it)}-' 

X {f (s*«(t), t) - L{t) B-\t) g(s«(0, t)} At, 
A«(tfc_i) =0, 



dsW =g(s(*),i)dt + B(i)d/3«(t), 
ds*(*)(i) = L(i)5-i(t)ds«(t), 



from t — tk-i to t 



tk and set 




exp 
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Note that the realizations of Brownian motions must he the same as in 
simulation of the importance processes. 

(c) For each i compute 

w^=n^,Z^pij,\4^l 

and re-normalize the weights to sum to unity. 

(d) If the effective number of particles is too low, perform resampling. 

Some practical points about the implementation: 



The importance process s{t) required by the algorithm can be obtained by 
using, for example, the extended Kalman filter (EKF). An example of this 
approach is given in Section [3. II of this article. 

The simulation of the importance processes and likelihood ratios above 
can be performed using any of the well kno wn numerical methods for sim- 
ulation of stochastic differential equations (|Kloeden and Platenl . [liii). In 



this article we have used the simple Euler-Maruyama method, which can 
be considered as a stochastic version of the Euler integration for non- 
stochastic differential equations. 

The class ^ is actually very restricted class of dynamic models, where it is re- 
quired that the probability law of the state is absolutely continuous with respect 
to the law of the driving Brownian motion. This kind of models are common in 
mathematical treatment of stochastic differential equation s and such models can 
be found, for example , in mathematical finance (see, e.g., Karatzas and ShreveL 



19911 : I0ksendall . 120031 ). However, most of the models used in navigation and 
telecommunications applications do not fit into this class, and for this reason 
the results need to be extended. 

It is also possible to construct a similar SIR algorithm for more general 
models, where there is an absolutely continuous type of model, which is em- 
bedded inside a deterministic differential equation model. This kind of models 
are typical in nav i gation , communicat i on and stocha s tic control applications 



(Bar-Shalom et al.U200ll: iGrewal et all I2OOII : IStengell . Il994 IVan Treesl Il968 



1971I ). where the deterministic part is typically a plain integral operator. Be- 
cause the outer operator is deterministic, the likelihood ratios of processes are 
determined by the inner stochastic processes alone and thus importance sam- 
pling of this kind of process is very similar to sampling of the processes considered 
above. 

Assume that the model is of the form 

— = fi(xi,X2,t), 
dx2 =f2(xi,X2,t)dt-|-L(i)d^ 

Yk ^p{yk |xi(tfc),x2(tfc)). 
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where fi(-) and f2(-) are deterministic functions, f3{t) is a Brownian motion, L{t) 
is invertible matrix and the initial conditions are Xi(0),X2(0) ~ p(xi (0), X2(0)). 
Note that because the diniensionahty of Brownian motion is less than of the 
joint state (xi X2)"^ it is not possible to compute the likelihood ratio between 
the process and Brownian motion by the Girsanov theorem directly. 

However, it turns out that if the importance process for (xi X2)"'" is formed 
as follows 

^=fl(Sl,S2,t) 

ds2 =g2(si,S2,i)dt + S(t) d/3, 

then the importance weights can be computed in exactly the same way as when 
forming importance sample of X2(t) using S2{t) as the importance process. 

The likelihood ratio expression is given in Thcorcm lA.3l of Appendix 1X1 The 
SIR algorithm is started by first drawing samples from the initial distribution 
and then for each measurement, the following steps are performed: 

Algorithm 2.2 (CD-SIR II). Given the importance process Si{t),S2{t), a 

weighted set of samples {Ti'i\_i,^2\-i'^k-i} '^^'^ ''^^'"^ measurement yk, 
a single step of continuous-discrete sequential importance resampling can be 
performed as follows: 

(a) Simulate N realizations of the importance processes 



x 



dt 

ds^'^ ^g2isf\s^'\t)dt + Bit)df3^'\t), s«(tfe_i) =x. 



(i) 

2,fe-l 



ds' 



it) 



dt 



X 



l,k-l 



(h) Simulate the log-likelihood ratios (using the same Brownian motion real- 
izations as above) 

dA« = {f2(s*«(t),s;W(t),t)-L(t)S-i(t)g2(s«W,s«(t),t)}^ 
X L-^{t)Q'\t)dp^^{t) 

-l{f2(st«(t),s;«(t),t)-i(t)S-^(t)g2(s«(t),S«(t),t)}^ 

X {L{t)Q{t)L'(t)]-' 

X {Us[^\t),sl^'^{t),t)-L{t)B"\t)^2{4\t),sf{t),t)]dt, 
A«(tfc_i)=0, 
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from t = tk-i to t ^ tk and set 




(tk) 



(8) 



exp 



(c) For each i compute 



•k 




(9) 



and re-normalize the weights to sum to unity, 
(d) If the effective number of particles is too low, perform resampling. 

The importance process s{t) required by the algorithm can be obtained by using, 
for example, continuous-discrete EKF and then extracting the estimate of the 
inner process S2{t) from the joint estimate. 

2.3. Rao-Blackwellization of Conditionally Gaussian Models 

Now we shall consider dynamic models, where a linear stochastic differential 
equation is driven by a singular or non-singular model considered in the pre- 
vious section. This kind of models can be handled such that only the inner 
process is sampled and the linear part is integrated out using the continuous- 
discrete Kalman filter. Then it is possible to form a Rao-Blackwellized estimate, 
where the probability density is approximated by a mixture of Gaussian dis- 
tributions. The measurement model is assumed to be of the same form as in 
previous sections, but linear with respect to the state variables corresponding to 
the linear part of the dynamic process. 

Dynamic models with conditionally Gaussian parts arise, for example, when 
the measurement noise correlations are modeled with state augmentation (see, 
e.g., [Gelb, 1974) . Actually, in this case, the direct application of particle fil- 
ter without Rao-Blackwellization would be impossible because the measurement 
model is formally singular. However, the Rao-Blackwellized filter can be easily 
applied to this kind of models. 

Assume that the dynamic model is of the form 



where rj and (3 are independent Brownian motions with diffusion matrices Qjj {t) 
and Qi3{t), respectively. Also assume that the initial conditions are given as: 



dxi 
dx2 
"dT 

dX3 



F(x2,X3,t)xi d^ + fi(x2,X3,t) dt + F(x2,X3,t) dJ7 



f2(x2,X3,i) 

f3(x2,X3,i)dt-|-L(t)d/3, 



(10) 



xi(0)~N(mo,Po) 

X2(0),X3(0)~P(X2(0),X3(0)), 



(11) 
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and the initial conditions of xi(0) arc independent from those of X2(0) and X3(0). 
In this case an importance process can be formed as 



dSi ==F(s2,S3,t)sidt + fi(S2,S3,t)dt+ V^(s2,S3,i)dT7, 
^=f2(s2,S3,t) (12) 

dS3 = g3(s2,S3,t) dt + B{t) d/3, 

with the same initial conditions. In both the original and importance processes, 
conditionally to the filtration of the second Brownian motion J-t — (t(/3(s),0 < 
s < t) and to initial conditions, the law of the first equation is determined by 
the mean and covariance of the Gaussian process, which is driven by the process 
r]{t). Thus, conditionally to X2 and X3 the process Xi(t) is Gaussian for all t. 
The same applies to the importance process. 

Now it is possible to integrate out the Gaussian parts of both processes. This 
procedure results in the following marginalized equations for the original process: 

- (X2 , X3 , i ) m, (t ) + f 1 (X2 , X3 , 



dt 

= f (X2 , X3 , i ) ) + ) (X2 , X3 , i ) 

dt 

+ T/(x2,X3,t)Q„(i) l^^(x2,X3,0 (13) 

-— = £2 X2,X3,t) 

dt 

dx3 =f3(x2,X3,t)dt + L(t)d/3, 

where vc\.x(fy and Px{t") are the mean and covariance of the Gaussian process. 
For the importance process we get similarly: 



dva.s(t) 

dt 
dP.(t) 

dt 



F(s2,S3,t) m.s{t) + fi(S2,S3,t) 

F (S2 , S3 , t ) F. (<) + p. (i ) F^ (S2 , S3 , 

V^(s2,S3,i)Q^(i) F^(S2,S3,i) (14) 
f2(s2,S3,i) 



dS2 

"dT 

dS3 = g3(s2,S3,t)di + 5(t) d/3. 



The models ([T^ and have now the form, where the Algorithm 12.21 can be 
used. The importance sampling now results in the set of weighted samples 

{zz;«,m«,pW,x«,x«}, (15) 
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such that the probabiUty density of the state x{t) = {xi{t),X2{t),xs{t)) is ap- 
proximately given as 

p(xi(t),X2(i),X3(t)) 

« N(xi(t) I m«,pW)5(x2(i) - x^'))5(x3(f) - xW). (16) 

i 

where 6{-) is the Dirac delta function. If the measurement model is of the form 

P(yfc |x(tfc)) = N(yfc I Hk (x2(tfc),X3(tfc)) xi{tk),Rk (x2(tfc),X3(tfc))) , (17) 

then conditionally to X2(tfc), X3(ffe) also the measurement model is linear Gaus- 
sian and the Kalman filter update equations can be applied. The resulting 
algorithm is the following: 



Algorithm 2.3 (CDRB-SIR I). Given the importance process, a set of 

importance samples {x.2\_i,yi^\_i,rnk-i^ -Pk-i'''^k-i '■ * — l)---)-^} ^nd 
the measurement yk, a single step o/ conditionally Gaussian continuous-discrete 
Rao-Blackwellized SIR is the following: 



(a) Simulate N realizations of the importance process 



^ = F(s«(i),Bi^*)m«(t) +fi(s«,s«,t) 

+ V{4\4\t)Qr,{t) V^{4\4\t) 
^=f2(s«,S«,t) 

ds« =g3(s«,s«,t)di + 5(t)d/3«, 



with initial conditions 





-i) 


('0 

= ^k-i 


Pi''>{tk- 


i) 


_ p(i) 
— ^k~l 


4\tk- 


■i) 


(i) 

~ ^2,fc-l 


4\tk- 


-i) 


(0 

~ ^Z,k-1^ 



(19) 
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(b) Simulate the scaled importance process 



F{s*/\t),sr,t)m:W^t)+f,{s;''>,s;^\t) 



dt 



+ V(s;«,s;«,i)Q,(t) V^isf\sf\t) 



(20) 



dS2 _ 

I2 , S3 , 



ds;(*) = L(i)p-i(t)ds3, 
with the same initial conditions from t = tk-i to t = tk and set 

m-«=m:«(tfc) 



-(i) 



sf\tk). 



(21) 



(c) Simulate the log-likelihood ratios (again, using the same Brownian motion 
realizations as in importance process) 

dAW = {f3(s;«(i),s;«(t),t)-L(t)S-i(t)g3(s«(0,s«(t),t)}^ 
xL-^{t)Qp\t)d(3^^{t) 



-^{H4''\t),sT{t),t)-L{t)B-\t)U4\t),4\t),t)Y 
x{L{t)Q0{t)L''{t)}-' 

X {f3{sf\t),sf\t),t) - mB-\t)gs{s^^\t),4\t),t)}dt, 
AW(tfe_i)=0, 



and set 



= exp 



)} 



(22) 
(23) 



(d) For each i perform the Kalman filter update 



Ml 



K, 



m 
P 



^fe(x2,fe,X3^^)mj, 

^(xa,x« ) ^(x« ,x« ) + P,(x« ,x« ) 
p-«/?J(x«,x«){5«}- 



(24) 



-(») 
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compute the importance weight 

-i^^=-ili4^'N(y,|M«,5«), (25) 

and re-normalize the weights to sum to unity, 
(e) If the effective number of particles is too low, perform resampling. 

The importance process can be formed, for example, by computing a joint Gaus- 
sian approximation by EKF and then extracting only the estimates correspond- 
ing to the innermost process. Note that the Rao-Blackwellization procedure can 
be often performed approximately, even when the model is not completely Gaus- 
sian. The Kalman filter steps can be replaced with the corresponding steps of 
EKF, when the model is slightly non-linear. This approach has been successfully 
applied in the context of multiple target tracking in article (Sarkka et al.. . ,2007h . 



2.4. Rao-Blackwellization of Models with Static Parameters 

Analogously to the discrete time case presented in lStorvik the procedure 



of Rao-Blackwellization can often be applied to models with unknown static 
parameters. If the posterior distribution of the unknown static parameters 9 de- 
pends only on a suitable set of sufficient statistics Tk = Tk{^{ti), . . . , x(ife), yi:fc), 
the parameter can be marginalized out analytically and only the state needs to 
be sampled. 

This kind of models arise, for example, when the measurement noise variance 
or some other other parameters of the measurement model are unknown. Two 
models of this kind are presented in Section [3. II 

Assume that the model is of the form 

dx = f (x, t, 6) dt + L{t, 9) df3 
Yk ^p{yk \ x{tk),6), 

where 9 is an unknown static parameter. Also assume that f(-) and L{-) are of 
such form that the model is either non-singular or singular model considered in 
Sections O 

Now assume that the prior distribution of 9 has some finite dimensional 
sufficient statistics Tq: 

p{9)=p{e\%), (27) 

also assume that conditional posterior distribution of 9 has sufficient statistics 
Tk = 7j;(x(ti), . . . , x(tfc), yi-fe) of the same dimensionality as Tq 

p{9 I x(<i), . . . , x(tfc), yi:fe) = p{9 I Tk), (28) 

such that there exists an algorithm $(•) that can be used for efBciently perform- 
ing the update 

Tfc =$(rfc_i,x(ife),yfc). (29) 
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Further assume that the marginal likehhood 

p(yfc|x(tfc),rfc_i) = y"p(yfc|x(tfc),0)p(0|7I._i)d0, (30) 

can be efficiently evaluated. The above conditions are met, for example, when for 
fixed x(tfc) the distribution p(0 | 7fc_i) is conjugate for the Hkelihoodp(yfe |x(tfe),0) 
with respect to 0. 

The resulting algorithm is now the following: 

Algorithm 2.4 (CDRB-SIR II). Given the importance process, a weighted 
set of samples {x^*^j^, T^^^^j^, w^*^-^} and the new measurement y^, a single step 
of continuous-discrete Rao-Blackwellized SIR with static parameters can be per- 
formed as follows: 

(a) Simulate the importance process, scaled importance process and log-likelihood 
ratio as in Alaorithm \2.1\ or \2.^ This results in the sample set {x^'^*} and 

likelihood ratios {Z^^^}. 
(h) For each i compute new sufficient statistics 

r« = <f>(r«„x«,y,), (31) 

evaluate the importance weights as 

u^^u^,Z^p(j,\4^,Tl^^, (32) 

and re-normalize the weights to sum to unity, 
(c) If the effective number of particles is too low, perform resampling. 

Actually, the sufficient statistics could be functionals of the whole state trajec- 
tory, in which case they could be simulated together with the state. 

3. Numerical Simulations 

In this section the continuous-discrete sequential importance sampling is applied 
to estimation of partially measured simple pendulum which is distorted by a 
random noise term and to estimation of the spread of an infectious disease. 
Several other applications and the more details on the presented applications 
can be found in the doctoral dissertation of iSarkkai (|2006hV 



3. 1. Simple Pendulum with Noise 

The stochastic diff erential equation for the angular position of a simple pendulum 
(|Alonso and Finn|jl980.1 . which is distorted by random white noise accelerations 
w(t) with spectral density q can be written as 

d^x 

—-ir-\-a am(x)^ wit). (33) 
at-' 
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where a is the angular velocity of the (linearized) pendulum. If we define the 
state as X = (a: dx/dt)^ and change to state space form and to the integral 
equation notation in terms of Brownian motion, the model can be written as 

da;i 

^ " (34) 
dx2 — — sin(a;i) dt + df3, 

where P{t) has the diffusion coefhcient q, which is model of the form ([5]). As- 
sume that the state of the pendulum is measured once per unit time and the 
measurements are corrupted by Gaussian measurement noise with an unknown 



2 T 2/ 2\ V^&j 

a ^ Inv-x (1^0, c^o), 

This is now a model with an unknown static parameter as discussed in Section 



The importance process can be now formed by the continuous-di screte ex- 
tended Kalman filter (EKF) (see, e.g., Ijazwinskil . 1X9701 : iGelbl . \l974 \ and the 



result is a 2-dimensional Gaussian approximation for the joint distribution of 
the state x(ife) = {xi{tk) X2{tk)Y' ■ Forming this approximation requires that 
the variance ct^ is assumed to be known, but fortunately a very rough approxi- 
mation based on the estimated af. is enough in practice. In that case the EKF 
based approximation can be constructed as follows: 

(a) Assume that the posterior distribution of a particle x*^*) {t) is approximately 
Gaussian 

x«(t)|yi:fc-i^N(m(t),P(t)). (36) 

Note that immediately after a measurement, a single sampled particle ac- 
tually has a Dirac delta distribution, which also is a (degenerate) Gaussian 
distribution. 

(b) By forming a first order Taylor series expansion to the right hand side of 
the equation (p4|) we get that after a sufhciently small time interval 5t the 
state mean and covariance can be approximated as 



m(t + 5t) = m(t) + f (m(t)) 5t 

P{t + St) = P{t) + [F{m{t)) P{t) + P{t) f ^(m(i)) + Q] St 



where f (x) = {x2 — sin(a;i))"^, F{x.) is the Jacobian matrix of f (x) and 
Q = diag(0 q). 

(c) We may now form Gaussian approximation to the state at time t + St with 
the mean and covariance above. If we continue this process recursively 
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and take limit St —^ 0, we get that we may approximate the process as 
Gaussian process with mean and covariance 

dP(t) 

= F{m{t)) Pit) + P{t) F^(mW) + Q, 

The above resuh states that between the measurements we can approximate 
the mean and covariance of the process (|34p by integrating the deterministic 
differential equations ([55)1 . The result is a Gaussian process, that is, a Gaussian 
approximation to the state process at any instance of time. 

The importance process can be now constructed as follows. For each particle 
i do the following: 

(a) Solve the approximate predicted mean and covariance at time tj, from the 
differential equations ([55]) by starting from initial conditions m{tk-i) = 
x«i, P(tfe.i) = 0. 

(b) Assuming that is known, the approximate joint distribution of the state 
and measurement is Gaussian and thus we can compute the posterior dis- 
tribution of the state in closed form. 

If the resulting approximate marginal posterior mean and covariance of X2 [tk ] are 
m2,fe and P22.k, then a suitable importance process is (assuming that sampling 
interval is At) 




dsi 
"dT ' 

dS2 : 

with initial conditions 

Sl(tfc-l) = Xi^k-l 
S2{tk-l) = X2.k^l- 

The equations for the scaled importance process can be now written as 
dt 2 



(39) 



(40) 



d'S2 = [ \l n ) ("^2,fc - X2.,k-l) dt + d/3, 



(41) 



with initial conditions 



P22,fc At 



Si(ifc-i) = xi^k-i 
S2{tk-i) = a;2,fc-i, 



(42) 
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Fig. 1. The result of applying continuous-discrete particle filter with EKF proposal to a 
simulated noisy pendulum data. 



The full state of the algorithm at time step k — 1 consists of the set of particles 

l'"fc-l' ■'-l.fc-l' •^2,fe-l' J y^-^) 

where w'^^\^ is the importance weight, x'^^\__^^x''2\_i is the state of the pendulum, 

and o'l^-l are the sufficient statistics of the variance parameter. 

Figure [1] shows the result of applying the continuous-discrete particle fil- 
ter with EKF proposal and 1000 particles to a simulated data. The data was 
generated from the noisy pendulum model with process noise spectral density 
q = 0.01, angular velocity a = 1 and the sampling step size was At = 0.1. The 
estimate can be seen to be quite close to the true signal. 

In the simulation, the true measurement variance was ct^ = 0.25. The prior 
distribution used for the unknown variance parameter was ~ Inv-x^(2, 0.2). 

The evolution of the posterior distribution of the variance parameter is shown 
in the Figure [21 In the beginning the uncertainty about the variance is higher, 
but the distribution quickly concentrates to the neighborhood of the true value. 



3.2. Spread of Infectious Diseases 

The classic model for the dynamics of infectious diseases is the SIR model (The 
model is called the SIR model, because the variables X{t), Y{t), and Z{t) denote 
the susceptible, infective and removed compartments and for this reason are 
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Fig. 2. The evolution of variance distribution in tlie noisy pendulum problem. 



often denoted as S(t), I(t ) , and R(t), respectively') (IKermack a.nd McKendrick , 
19271: lAnderson and Mavl . Il99ll: lMurravl]l99l iHethcotd . l2000l) . which is valid 
for sufficiently large N: 



dX/dt 


= -bYX/N, 


X{0) 


= Xo, 


(44) 


dY/dt 


= bY X/N - g Y, 


Y{0) 


= Yo, 


(45) 


dZ/dt 


= 9Y, 


Z{0) 


= Zo, 


(46) 



where X{t) is the number of susceptibles at time t, Xq > is the initial number 
of susceptibles, Y{t) is the number of infectives who are capable of transmitting 
the infection, Fo > is the initial number of infectives, Z{t) is the number 
of recovered or dead individuals which cannot be infected anymore, > is 
the initial number of individuals in this class, N — X{t) + Y{t) + Z{t) is the 
(constant) total number of individuals, b is the contact rate which determines 
the rate of individuals moving from susceptible class to infectious class, and g is 
the waiting time parameter such that l/(? is the average length of the infectious 
period. 

If we model the contact number a = b/g as the exponential of the Brownian 
motion, then the stochastic equations for the proportions of individuals in each 
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class can be written as ( Sarkka . 



20063): 



dx/dt — ~g exp(A) yx 

<iy/<it ^ 9 (i^pWyx ~ gy (47) 
dA = dp, 

where (3{t) is a standard Brownian motion and A = Incr. 
A suitable initial distribution for a;(0) and y(0) is 

y{0) ^Beta{ay,f3y), (48) 
x(0) = l-y(0), (49) 

where Py ^ ay. The initial conditions z(0) can be assumed to be zero without 
loss of generality. 

In the classical SIR model the values X{t), Y{t) and Z{t) are not restricted to 
integer values, and thus they cannot be interpreted as counts as such. A sensible 
stochastic interpretation of these values is that they are the average numbers 
of individuals in each class and the actual numbers of individuals are Poisson 
distributed with these means. 

Assume that the number of dead individuals is recorded. Then the number 
of the dead individuals dk on time period [tk-i,tk] has the distribution 

Pidk I {xiT),yiT) : < r < tk},N) = Poisson(dfe | iV 0^), (50) 

where 

9k = x{tk-i) - x{tk) + y{tk-i) - y{tk). (51) 

The population size N is unknown and it can be modeled as having a Gamma 
prior distribution 

p{N) = Gamma(A^ | ao, /3o), (52) 
with some suitably chosen ao and /3o. As shown in ( Sarkka . 2006lJ ) this model is 



now of such form that it is possible to integrate out the population size N from 
the equations and the Algorithm 12.41 can be applied. 

The continuous- discrete SIR filter was applied to th e classical Bombay plague 



data presented in ijKermack and McKendrickl . Il927l ). An EKF base d Gaus- 



sian p rocess approximation was used as the importance process (see, ISarkka . 



2006bl . for details) and 10000 particles was used. The prior distribution for 
proportion of initial infectives was Beta(l, 100). The population size prior was 
Gamma(10, 0.001). The waiting time parameter was assumed to be g = 1. The 
prior distribution for A(0) was N(ln(5),4). The diffusion coefficient of the Brow- 
nian motion was q = 0.001. 

The final filtered estimates of the histories of x{t), y{t), and z{t) are shown in 
Figure [31 These estimates are filtered estimates, that is, they are conditional to 
the previously observed measurements only. That is, the estimate on week t is 
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95% Quantile of z(t) 




Fig. 3. Filtered estimates of values of x{t), y{t), and z{t) from tlie Bombay data. 



the estimate that could be actually computed on week t without any knowledge 
of the future observations. The estimates look quite much as what would be 
expected: the proportion of susceptibles x{t) decreases in time and the number of 
infectives y{t) increases up to a maximum and then decreases to zero. However, 
these estimated values are not very useful themselves. The reason for this is 
that, for example, the value Xoo which is the remaining value of susceptibles in 
the end depends on the choice of g and other prior parameters. That is, these 
estimated values are not absolute in the sense that their values depend heavily 
on the prior assumptions. 

Much more informative quantity is the value dZ/At , whose filtered estimate is 
shown in FigurelH The classical estimate presented in (jKermack and McKendrick , 



19271 ) is also shown. The SIR filter estimate can be seen to differ a bit from the 



classical estimate, but still both the estimates look quite much like what would 
be expected. Note that the classical estimate is based on all measurements, 
whereas the filtered estimate is based on observations made up to that time 
only. That is, the filter estimate could be actually computed on week t, but the 
classical estimate could not. 

The filtered estimates of values a{t) are shown in Figure [5l The value can 
be seen to vary a bit on time, but the estimated expected value remains on the 
range [1.4, 1.8] all the time. As can be seen from the figure, according to the 
data the value of a{t) is not constant. This is not surprising, because the spatial 
and other unknown effects are not accounted at all in the classical SIR model 
and these effects typically affect the number of contacts. 
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Fig. 4. Filtered estima t e of dZ/dt from the Bombay data. The estimate of 
jKermacl< and IVIcKendrickLFr927|) is also shown for comparison. 




Fig. 5. Bombay plague: Filtered estimate of value a{t). 



A very useful indicator value is a{t) x{t), whose filtered estimate is shown in 
Figure [Sj In the deterministic SIR model with constant a this indicator defines 
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Fig. 6. Bombay plague: Filtered estimate of value x{t) a{t). 



the asymptotic behavior of the epidemic (see, e.g., Hethcote . 2000l ): If ax(t) < 1 
then the number of infectives will decrease to zero as f ^ oo. li ax{t) > 1 then 
the number of infectives will first increase up to a maximum and then decrease 
to zero. As can be seen from the Figure [6] the filtered estimate of the indicator 
value goes below 1 just after the maximum somewhere between weeks 15-16, 
which can be seen in Figured] That is, the estimated value of a{t) x{t) could be 
used as an indicator, which tells if the epidemic is over or not. 

Using the particles it is also possible to predict ahead to the future and 
estimate the time when the maximum of the epidemic will be reached. The 
estimate computed from the filtering result is shown in the Figure [T] Again, the 
estimates are filtered estimates and the estimate on week t could be actually 
computed on week t, because it depends only on the counts observed up to that 
time. The filtered estimate can be seen to quickly converge to the values near 
the correct maximum on weeks 15-16. It is interesting to see that the prediction 
is quite accurate already around the week 10, which is far before reaching the 
actual maximum. If this kind of prediction had been done on, for example, week 
10 of the disease, it would have predicted the time of actual epidemic maximum 
quite accurately. After the maximum has been observed, the estimate quickly 
converges to a constant value, which according to the Figure H] is likely to be 
near the true maximum. 

A very useful estimate is also the expected total number of deaths caused by 
the epidemic. This can be computed from the filtered estimates and the result 
is shown in Figure [51 In the beginning the estimate is very diffuse, but after 
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Fig. 7. Bombay plague: Filtered estimate of time of maximum of epidemic. 
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Fig. 8. Bombay plague: Filtered estimated of number of deaths. 



maximum has been reached the estimate converges near tlic; correct vahie. The 
estimate is a bit less than the observed value long before reaching the maximum, 
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which might be due to existence of two maximums in the observed data (see, 
Figure [4]). Because the second maximum is not predicted by the model, the 
extra number of deaths caused by it cannot be seen in the predictions. 



4. Discussion 



The importance processes used in the continuous-discrete particle filtering ex- 
amples are very simple and better alternatives definitely exists. In principle, 
the optimal importance process in the continuous-discrete particle filtering case 
would have the same law as the smoothing solution. Thus, constructing the im- 
portance process based on the smoothing solution instead of linearly interpolated 
filtering solutions, as in this article, could lead to more efficient particle filtering 
methods. In some cases it could be possible to construct a process, which would 
have exactly the same law as the optimal importance process. 

A weakness in the continuous-discrete particle filtering framework is that the 
importance process has to be scaled before sampling. In practice, this restricts 
the possible forms of importance processes to those having the same dispersion 
matrix as the original process. However, this a bit more general case with explicit 
scaling is treated here instead of requiring L{t) — B{t), because this leaves more 
room to the possibility that maybe the equations could be modified such that 
the scaling of the importance process would not be needed. 

Another weakness of the framework is that the time-discretization intro- 
duces biasedness to the estimation. The time-discretization is due to the usage 
of numerical integration methods for SDEs, which use discretization in time. 
However, there exis ts method for simulating SDEs without time-discretization 



(|Beskos et al.l , 120061 ) and maybe by using this kind of methods this biasedness 



could be eliminated. 

The continuous-discrete sequential importance resampling framework could 
be extended to the case of stochastic differential equations driven by more general 
martingale s, for example, ge neral Levy processes such as compound Poisson 
processes ( Applebaum . 20041 ). This would allow modeling of sudden changes 



in signals. This extension could be possible by simply replacing the Brownian 
motion in the Girsanov theorem by a more general martingale. 

It could be possible to generalize the continuous-discrete sequential impor- 
tance sampling framework to continuous-time filtering problem s. Then the ex- 
tended Kalman-Bucy filter or the unscented Kalman-Bucy filter ( Sarkka . 2006bL 



E)07 l could be used for forming the importance process and the actual filtering 
result would be formed by weighting the importance process samples properly. 

The likelihood ratio expressions in Theorems lA.2l and lA.3Jhave an interesting 
connection to the variational method considered in article fArchambeau et al 
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20071 ) . If we select the processes as 

dx(t) =f(x(t),i)di + \/Sd/3(i) (53) 

ds(i) = fL(s(t),<)dt + \/Sd/3(t), (54) 

where x(t) is a process with density p{-) and s(t) is a process with density 
q{-), then by taking the expectation of negative logarithm of ([63]) we get the 
expression for the KL-divergence between q and p: 



KL[q\p]=E 



{f (s(t), t) - hisit), i)}^S-i{f (s(t), t) - fL(s(i), t)) di 



(55) 

which is exactly the expression obtained heuristically in Archambeau et al] ( 2007[ ). 
Thus the extensions to singular models would also apply to that method. 



5. Conclusions 

In this article, a new class of methods for continuous-discrete sequential impor- 
tance sampling (particle filtering) has been presented. These methods are based 
on transformations of probability measures using the Girsanov theorem. The 
new methods are applicable to a general class of models. In particular, they 
can be applied to many models with singular dispersion matrices, unlike many 
previously proposed measure transformation based sampling methods. The new 
methods have been illustrated in a simulated problem, where both the implemen- 
tation details of the algorithms and the simulation results have been reported. 
The methods have also been applied to estimation of the spread of an infectious 
disease based on counts of dead individuals. 

The classical continuous-discrete extended Kalman filter as well as the re- 
cently developed continuous-discrete unscented Kalman filter can be used for 
forming importance processes for the new continuous-discrete particle filters. 
This way the efiiciency of the Gaussian approximation based filters can be com- 
bined with the accuracy of the particle approximations. Closed form marginaliza- 
tion or Rao-Blackwellization can be applied if the model is conditionally Gaus- 
sian or if the model contains unknown static parameters and has a suitable 
conjugate form. In most cases Rao-Blackwellization leads to a significant im- 
provement in the efficiency of the particle filtering algorithm. 



A. Likelihood Ratios of SDEs 



In the computation of the likelihood ratios of stochastic difi^erenti al equations 
we n e ed a slightly generahzed v e rsion of t h e Gir sanov theorem (jKallianpur . 
19801 : iKaratzas and Shrevel . Il99ll : l0ksenda]| . l2003h . The generalized theorem 



can be obtained, for e xample, as a special case from the theorems presented in 
Delvon and h1] (|2006l ). 
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Theorem A.l (Girsanov). Let (3 ~ (/3i, . . . , /3d) be a Brownian motion 
with diffusion matrix Q{t) under the probability measure P. Let 6 : fix M_|_ i— > M.'^ 
be an adapted process such that the process Z defined as 



Z{t) = exp<^ / 6>^(i)d/3(t) 



1 



e'^{t) Q{t)e{t)dt 



satisfies E[Z(t)] = 1. Then the process 

d^{t) = df3{t) - Q{t)e{t)dt 



(56) 



(57) 



is a Brownian motion with diffusion matrix Q{t) under the probability measure 
P defined via the relation 



E 



"dP 






dP 



Z{t), 



(58) 



where Tt is the natural filtration of the Brownian motion (3{t). 



Proof. See, for example. iDelvon and Hd ([20061). 

Theorem A. 2 (Transformation of SDE Solutions I). Let 

dx(t) ^ f (x(t), t) dt + L{t) d/3(t), x(0) = xq 
ds{t) = g(s(t), <) dt + B{t) d(3{t), s(0) = a;o, 



(59) 
(60) 



where (3{t) is a Brownian motion with diffusion matrix Q{t) with respect to 
measure P. Let Tt be its natural filtration. The matrices L{t) and B{t) are 
assumed to be invertible for all t. Now the process s*{t) defined as 



ds* ^ L{t) B-\t)ds, s(0)=xo 



is a weak solution to the Equation (j59p under the measure 
relation 

■"dP 



E 



(61) 

defined by the 
(62) 



where 

Z{t) = exp 



{f (s*(t), t) - L{t) B-'{t) g{m,t)f L-^{t) Q'\t) mt) 
{f(s*(t),t)-L(t)i?-i(t)g(s(t),t)}^ 



X {L{t)Q{t)L^{t)} \{{s*{t),t)-L{t)B-\t)g{s{t),t)}dt 



(63) 
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Proof. By substituting the expression (|60p into Equation (|6ip . solving for 
d/3(t), we get 

d/3(i) = L-^O ds* - (t) g(s(t), t) dt. (64) 

If we now define 

e{t) = O-i(t) L-^t) f (s*(<), i) - Q-i(t) B-\t) g(s(i), t), (65) 

then under the measure P defined by and ([55)1 with the process 9{t) defined 
as above, the following process is a Brownian motion with diffusion matrix Q(t): 



(66) 



d/3(i) = d/3(t) - Q{t)e{t)dt 

= L-\t) ds* - B-\t)g{s{t),t) dt 

- Q{t) Q-\t) L-i(t)f(s*(t),i) dt+Q{t) Q-\t)B-\t)g{sit),t)dt 
= L-\t) ds* - L-\t){{s*{t),t) dt 

By rearranging we get that 

ds* = {{s*{t),t)dt + L{t)d^{t) (67) 

and thus the result follows. The explicit expression for the likelihood ratio is 
given as follows: 

Z{t) = exp [ f {Q-\t)L-\t)i{s*{t),t) - Q-\t)B-\t)g{s{t),t)f dm 



{Q-\t)L-Ht)i{s*{t),t)-Q-\t)B-\t)g{s{t),t)y 



Q{t){Q-\t) L-\t){is*it),t) - Q-\t) B-\t)gisit),t)} dt 



exp 



exp 



{f (s*(t), t) - Lit) B-\t) g(s(i), t)Y L-^it) Q-\t) d/3(i) 

' {L-~\t) {{s*{t),t) - m B-\t) g(s(t), t)f 

X {L-^{t) Q-\t) Q{t)Q{t)-^ L-\t)} 
X {f (s*(i), t) - L{t) B-\t) g(s(t), t)} dt 

{i{s*{t),t) ~ L{t) B-\t) g(s(t), t)Y L-^it) Q'\t) d/3(t) 



{i{s*{t),t)~L{t)B-\t)g{s{t),t)y 



X {L{t) Q{t) [t)} \i{s*{t),t) - L{t) B-\t)g{s{t),t)} dt 



(68) 
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Theorem A. 3 (Transformation of SDE Solutions II). Assume that pro- 
cesses xi(i), X2(t), Si(t) and S2(t) are generated by the stochastic differential 
equations 



dxi 
"dT 
dx2 
dsi 
IT 

dS2 



fl(xi,X2,t), 

f2(xi,X2,t) dt + L{t) dj3, 

fl(si, S2, i), 

g2(si,S2,t) dt + B{t)d(3, 



xi(0) 


= Xi,o 


(69) 


MO) 


= X2,0 


(70) 


si(0) 


= xi,o 


(71) 


S2(0) = 


= X2,0, 


(72) 



where L(t) and B(t) are invertible matrices for all t > and under the measure 
P, f3{t) is a Brownian motion with diffusion matrix Qit). Then the processes si 
and S2 defined as 



dsl 



fl(Sl, 82:^)7 



ds* = L{t)B-\t) ds2, 



st(0) =xi,o 

83(0) = X2,0 



(73) 
(74) 



are weak solutions to the Equations (|69[) and (I70p under the measure P defined 
by the relation 



E 



"dP 






dP 



= z{t). 



(75) 



(76) 



where 

Z(t)=cxp[ / {{2{sl{t),s;{t),t) - L{t) B-\t)g2{siit),S2{t),t)f 
'-Jo 

X L-^{t) Q-\t)df3{t) 

-\f^ {f2(sj (t), S*(t), <) - m B-\t) g2(si(t), S2{t), t)f 

X {L'^{t)Q{t)L{t)}-' 

X {f2(sj (t), s;it), t) - L{t) B~\t) g2(si(t), S2{t), t)} dt 

Proof. From equations dZH), GSl) and ^ we get that 

d/3(t) - L-i(t) ds* - B-i(t) g2{si{t),S2{t),t)dt. 
If we now define 

e{t) = Q-\t) L-'{t)i2{sl{t),s*2it),t) - Q-\t) B-\t)g2{si{t),S2{t),t), (78) 



(77) 
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then similarly as in proof of Theorem lA. 21 we get that the process 0{t) defined 
as 

d/3(i) =d/3(t)- Q{t)e{t)dt 

= L-\t) ds* - B-\t) g2(si(t), S2(t), t) dt 

- Q{t)Q-\t)L-\t)i2{sl{t),s;{t),t)dt (79) 

+ Q{t) Q-\t) B-\t) g2(si(t),S2(i), t) At 

= L-\t)ds*^ L-'^{t){2{sl{t),s*{t),t)dt 

is a Brownian motion with respect to measure P and thus s| and 83 are the weak 
solutions to the equations (|69|) and ((70)) . 

References 

Alonso, M. and E. J. Finn (1980). Fundamental University Physics, Volume I: 
Mechanics and Thermodynamics (2nd ed.). Addison- Wesley. 

Anderson, R. M. and R. M. May (1991). Infectious Diseases of Humans: Dy- 
namics and Control. Oxford University Press. 

Applebaum, D. (2004). Levy Processes and Stochastic Calculus. Cambridge 
University Press. 

Archambeau, C, D. Cornford, M. Opper, and J. Shawe- Taylor (2007). Gaus- 
sian process approximations of stochastic differential equations. Journal of 
Machine Learning Research Workshop and Conference Proceedings 1, 1-16. 

Bar-Shalom, Y., X.-R. Li, and T. Kirubarajan (2001). Estimation with Applica- 
tions to Tracking and Navigation. Wiley Interscience. 

Beskos, A., O. Papaspiliopoulos, G. Roberts, and P. Fearnhead (2006). Exact and 
computationally efficient likelihood-based estimation for discretely observed 
diffusion processes (with discussion) . Journal of the Royal Statistical Society, 
Series B 68(3), 333 - 382. 

Chib, S., M. K. Pitt, and N. Shephard (2004). Likelihood based inference for 
diffusion driven models. Volume W20. 

Crisan, D. (2003). Exact rates of convergence for a branching particle approxi- 
mation to the solution of the Zakai equation. Ann. Probab. 31(2), 693-718. 

Crisan, D., J. Gaines, and T. Lyons (1998, October). Convergence of a branch- 
ing particle method to the solution of the zakai equation. SIAM Journal on 
Applied Mathematics 58(5), 1568-1590. 



28 



Crisan, D. and T. Lyons (1999). A particle approximation of the solution of the 
Kushner-Stratonovitch equation. Probab. Theory Relat. Fields 115, 549-578. 

Delyon, B. and Y. Hu (2006). Simulation of conditioned diffusion and application 
to parameter estimation. Stochastic Processes and their Applications 116, 
1660-1675. 

Doucet, A., N. de Freitas, and N. Gordon (2001). Sequential Monte Carlo Meth- 
ods in Practice. Springer. 

Durham, G. B. and A. R. Gallant (2002, July). Numerical techniques for maxi- 
mum likelihood estimation of continuous-time diffusion processes. Journal of 

Business & Economic Statistics 20(3), 297-316. 

Fearnhead, P., O. Papaspiliopoulos, and G. O. Roberts (2007). Particle filters for 
partially observed diffusions. Journal of the Royal Statistical Society, Series 
B. to appear. 

Gelb, A. (1974). Applied Optimal Estimation. The MIT Press. 

Golightly, A. and A. Wilkinson (2006). Bayesian sequential inference for nonlin- 
ear multivariate diffusions. Statist. Comput. 16(4), 323-338. 

Gordon, N. J., D. J. Salmond, and A. F. M. Smith (1993). Novel approach to 

nonlinoar/non-Gaussian Bayesian state estimation. In IEEE Proceedings on 
Radar and Signal Processing, Volume 140, pp. 107-113. 

Grewal, M. S., L. R. Weill, and A. P. Andrews (2001). Global Positioning Sys- 
tems, Inertial Navigation and Integration. Wiley Interscience. 

Hethcote, H. W. (2000). The mathematics of infectious diseases. SI AM Re- 
view 42(4), 599-653. 

lonides, E. L. (2004). Inference and filtering for partially observed diffusion 
processes via sequential Monte Carlo. Technical report. University of Michigan 
Statistics Department Technical Report #405. 

Jazwinski, A. H. (1966). Filtering for nonlinear dynamical systems. IEEE Trans- 
actions on Automatic Control 11(4), 765-766. 

Jazwinski, A. H. (1970). Stochastic Processes and Filtering Theory. Academic 
Press. 

Kallianpur, G. (1980). Stochastic Filtering Theory. Springer- Verlag. 

Karatzas, I. and S. E. Shreve (1991). Brownian Motion and Stochastic Calculus. 
Springer. 



29 



Kermack, W. O. and A. C. McKcndrick (1927). A contribution to the math- 
ematical theory of epidemics. Proceedings of the Royal Society of London, 
Series A 115, 700-721. 

Kitagawa, G. (1996). Monte Carlo filter and smoother for non-Gaussian nonlin- 
ear state space models. Journal of Computational and Graphical Statistics 5, 
1-25. 

Kloeden, P. E. and E. Platen (1999). Numerical Solution to Stochastic Differ- 
ential Equations. Springer. 

Moral, P. D. and L. Miclo (2000). Branching and interacting particle systems. 

Approximations of Fcynman-Kac formulae with apphcations to non-linear fil- 
tering. In Srhinaire de Prohahilites XXXIV. Springer- Vcrlag. 

Murray, J. D. (1993). Mathematical Biology, Vohimc 19. Springer. 

0ksendal, B. (2003). Stochastic Differential Equations: An Introduction with 
Applications (6 ed.). Springer. 

Pitt, M. K. and N. Shephard (1999). Filtering via simulation: Auxiliary particle 
filters. Journal of the American Statistical Association 94(446), 590-599. 

Ristic, B., S. Arulampalam, and N. Gordon (2004). Beyond the Kalman Filter. 
Artech House. 

Sarkka, S. (2006a, September). On sequential Monte Carlo sampling of dis- 
cretely observed stochastic differential equations. In Proceedings of NSSPW , 
Cambridge, September 2006. 

Sarkka, S. (2006b). Recursive Bayesian Inference on Stochastic Differential 
Equations. Doctoral dissertation, Helsinki University of Technology. 

Sarkka, S. (2007). On unscented Kalman filtering for state estimation of 
continuous-time nonlinear systems. IEEE Transactions on Automatic Con- 
trol 52(9), 1631-1641. 

Sarkka, S., A. Vchtari, and J. Lampincn (2007). Rao-BlackwcUizcd particle filter 
for multiple target tracking. Information Fusion Journal 8(1), 2 15. 

Stengel, R. F. (1994). Optimal Control and Estimation. Dover Publications, Inc. 

Storvik, G. (2002). Particle filters in state space models with the presence of 
unknown static parameters. IEEE Transactions on Signal Processing 50(2), 
281-289. 

Van Trees, H. L. (1968). Detection, Estimation, and Modulation Theory Part I. 
John Wiley & Sons, New York. 

Van Trees, H. L. (1971). Detection, Estimation, and Modulation Theory Part II. 
John Wiley & Sons, New York. 



30 



B. Acknowledgment 

The author would hke to thank Aki Vehtari, Jouko Lampinen and Ilkka Kalliomaki 
for helpful discussions and comments on the manuscript. 



31 



